Search results for "Survey sampling"
showing 10 items of 24 documents
Penalization and data reduction of auxiliary variables in survey sampling
2012
Survey sampling techniques are useful for estimating population parameters, such as the population total, when a large-dimensional auxiliary data set is available. This thesis deals with the estimation of the population total in the presence of an ill-conditioned large data set. In the first chapter, we give some basic definitions that will be used in the later chapters. The Horvitz-Thompson estimator is defined as an estimator which does not use auxiliary variables. The calibration technique is then defined to incorporate the auxiliary variables in order to improve the estimation of population totals for a fixed sample size. The second chapter is a part of a review article about ridge re…
B-Spline Estimation in a Survey Sampling Framework
2021
Nonparametric regression models have been used increasingly in recent years to model survey data and to incorporate auxiliary information efficiently in order to improve the estimation of totals, means or other study parameters such as the Gini index or the poverty rate. B-spline nonparametric regression has the benefit of being very flexible in modeling nonlinear survey data while keeping many similarities with, and properties of, classical linear regression. This method proved to be efficient for deriving a unique system of weights which allows many study parameters to be estimated efficiently and simultaneously. Applications on real and simulated survey data showed its high efficiency. This …
Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially o…
2014
In the near future, millions of load curves measuring the electricity consumption of French households in small time grids (probably half hours) will be available. All these collected load curves represent a huge amount of information which could be exploited using survey sampling techniques. In particular, the total consumption of a specific customer group (for example all the customers of an electricity supplier) could be estimated using unequal probability random sampling methods. Unfortunately, data collection may undergo technical problems resulting in missing values. In this paper we study a new estimation method for the mean curve in the presence of missing values which consists in…
Conditional Bias Robust Estimation of the Total of Curve Data by Sampling in a Finite Population: An Illustration on Electricity Load Curves
2020
For marketing or power grid management purposes, many studies based on the analysis of total electricity consumption curves of groups of customers are now carried out by electricity companies. Aggregated totals or mean load curves are estimated using individual curves measured on a fine time grid and collected according to some sampling design. Due to the skewness of the distribution of electricity consumption, these samples often contain outlying curves which may have an important impact on the usual estimation procedures. We introduce several robust estimators of the total consumption curve which are not sensitive to such outlying curves. These estimators are based on the conditio…
Robust estimation of mean electricity consumption curves by sampling for small areas in presence of missing values
2017
In this thesis, we address the problem of robust estimation of mean or total electricity consumption curves by sampling in a finite population for the entire population and for small areas. We are also interested in estimating mean curves by sampling in presence of partially missing trajectories. Indeed, many studies carried out in the French electricity company EDF, for marketing or power grid management purposes, are based on the analysis of mean or total electricity consumption curves at a fine time scale, for different groups of clients sharing some common characteristics. Because of privacy issues and financial costs, it is not possible to measure the electricity consumption curve of eac…
Effects of diabetes definition on global surveillance of diabetes prevalence and diagnosis: a pooled analysis of 96 population-based studies with 331…
2015
Diabetes has been defined on the basis of different biomarkers, including fasting plasma glucose (FPG), 2-h plasma glucose in an oral glucose tolerance test (2hOGTT), and HbA1c. We assessed the effect of different diagnostic definitions on both the population prevalence of diabetes and the classification of previously undiagnosed individuals as having diabetes versus not having diabetes in a pooled analysis of data from population-based health examination surveys in different regions.
Estimate the mean electricity consumption curve by survey and take auxiliary information into account
2012
In this thesis, we are interested in estimating the mean electricity consumption curve. Since the study variable is functional and storage capacities are limited or transmission costs are high, survey sampling techniques are interesting alternatives to signal compression techniques. We extend, in this functional framework, estimation methods that take into account available auxiliary information and that can improve the accuracy of the Horvitz-Thompson estimator of the mean trajectory. The first approach uses the auxiliary information at the estimation stage: the mean curve is estimated using model-assisted estimators with functional linear regression models. The second method involves the au…
A Proposal to estimate the roaming–dog Total in an urban area through a PPSWOR spatial sampling with sample size greater than two
2018
A Three-Dimensional Object Point Process for Detection of Cosmic Filaments
2007
We propose to apply an object point process to automatically delineate filaments of the large-scale structure in redshift catalogues. We illustrate the feasibility of the idea on an example from the recent 2dF Galaxy Redshift Survey, describe the procedure and characterize the results.
Correcting for non-ignorable missingness in smoking trends
2015
Data missing not at random (MNAR) is a major challenge in survey sampling. We propose an approach based on registry data to deal with non-ignorable missingness in health examination surveys. The approach relies on follow-up data available from administrative registers several years after the survey. For illustration we use data on smoking prevalence in the Finnish National FINRISK study conducted in 1972-1997. The data consist of measured survey information including missingness indicators, register-based background information and register-based time-to-disease survival data. The parameters of the missingness mechanism are estimable with these data although the original survey data are MNAR. The u…